Structured soft margin confidence weighted learning for grapheme-to-phoneme conversion

Authors

  • Keigo Kubo
  • Sakriani Sakti
  • Graham Neubig
  • Tomoki Toda
  • Satoshi Nakamura
Abstract

In recent years, structured online discriminative learning methods using second-order statistics have been shown to outperform conventional generative and discriminative models in the grapheme-to-phoneme (g2p) conversion task. However, these methods update the parameters by sequentially using the N-best hypotheses predicted with the current parameters. As a result, the parameters appearing in early hypotheses are overfitted compared with those in later hypotheses. In this paper, we propose a novel method called structured soft margin confidence weighted learning, which extends multiclass confidence weighted (CW) learning to structured learning. The proposed method extends multiclass CW in two ways that improve robustness to overfitting: (1) regularization inspired by soft margin support vector machines, which allows for margin error, and (2) an update that uses the N-best hypotheses simultaneously and interdependently. In an evaluation experiment on the g2p conversion task, the proposed method achieved a significantly lower phoneme error rate than all other approaches.
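The abstract reports results in terms of phoneme error rate. As a rough illustration only, the sketch below computes that metric under its common definition: total Levenshtein (edit) distance between predicted and reference phoneme sequences, divided by the total number of reference phonemes. The function names `edit_distance` and `phoneme_error_rate` and the sample phoneme strings are hypothetical, and the paper's exact evaluation protocol may differ.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phoneme
                curr[j - 1] + 1,     # insertion of a hypothesis phoneme
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(refs: Sequence[Sequence[str]],
                       hyps: Sequence[Sequence[str]]) -> float:
    """Total edit errors divided by total reference phonemes (assumed definition)."""
    if len(refs) != len(hyps):
        raise ValueError("refs and hyps must have the same number of entries")
    total_ref = sum(len(r) for r in refs)
    if total_ref == 0:
        raise ValueError("reference set contains no phonemes")
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return total_errors / total_ref


# Hypothetical usage with made-up pronunciations:
references = [["K", "AE", "T"], ["D", "AO", "G"]]
predictions = [["K", "AH", "T"], ["D", "AO", "G"]]
per = phoneme_error_rate(references, predictions)
```

Here one substitution across six reference phonemes gives a rate of one sixth. Pooling errors over the whole test set, rather than averaging per-word rates, is a design choice assumed for this sketch.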

Similar resources

Grapheme-to-phoneme conversion based on adaptive regularization of weight vectors

The current state-of-the-art approach in grapheme-to-phoneme (g2p) conversion is structured learning based on the Margin Infused Relaxed Algorithm (MIRA), which is an online discriminative training method for multiclass classification. However, it is known that the aggressive weight update method of MIRA is prone to overfitting, even if the current example is an outlier or noisy. Adaptive Regul...

Structured Adaptive Regularization of Weight Vectors for a Robust Grapheme-to-Phoneme Conversion Model

Grapheme-to-phoneme (g2p) conversion, used to estimate the pronunciations of out-of-vocabulary (OOV) words, is a highly important part of recognition systems, as well as text-to-speech systems. The current state-of-the-art approach in g2p conversion is structured learning based on the Margin Infused Relaxed Algorithm (MIRA), which is an online discriminative training method for multiclass class...

Comparison of Grapheme-to-Phoneme Conversion Methods on a Myanmar Pronunciation Dictionary

Grapheme-to-Phoneme (G2P) conversion is the task of predicting the pronunciation of a word given its graphemic or written form. It is a highly important part of both automatic speech recognition (ASR) and text-to-speech (TTS) systems. In this paper, we evaluate seven G2P conversion approaches: Adaptive Regularization of Weight Vectors (AROW) based structured learning (S-AROW), Conditional Rando...

Optimizing phoneme-to-grapheme conversion for out-of-vocabulary words in speech recognition

In this report, we present the results of further research on phoneme-to-grapheme (P2G) conversion for Out-Of-Vocabulary items (OOVs), recognized using phoneme recognition, in large vocabulary speech recognition. First, we summarize the results of previous research, and then we start with reporting on several optimization strategies for the Machine Learning technique we used to carry out P2G co...

Grapheme-to-Phoneme Conversion Associative Rules for K

In this paper, we describe a method for automatically extracting grapheme-to-phoneme conversion rules directly from the transcription of speech synthesis database and introduce a weighted score and jamo similarity to overcome the rule application difficulties. We make a structured rule tree by rule pruning and rule association, and can eliminate most of the rules with almost no decrease of the ...

Publication date: 2014